DETAILED ACTION
Claim Rejections - 35 USC § 103 

1. Claims 1-9, 13, 14, and 16-21 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over ISO/IEC 11172-3 (as described in the prior art section of the
specification, p. 2, Fig. 1 labeled prior art), hereinafter referred to as Spec_Prior_Art, in 
view of Nakajima et al. ("A Fast Audio Classification from MPEG Coded Data" ICASSP 
'99, vol. 6, May 1999) hereinafter referred to as Nakajima. 

Regarding claim 1, Spec_Prior_Art describes the MPEG1/Audio layer 1 system 
and includes the following: 

• a subband dividing section dividing inputted audio information including a sound
signal into a plurality of frequency bands (p. 2, line 15, Fig. 1, item 111);

• a scaling section calculating a scaling factor, which indicates a multiplying power
to a reference value, of each subband divided by the subband dividing section into
each of the frequency bands, and aligning each dynamic range (Fig. 1, item 112); and

• a coding processing section compressing and coding an output signal from the
scaling section by using a MPEG system to output as coded bit stream data (Fig. 1,
items 113-115).

But Spec_Prior_Art does not specifically teach "further including a feature 
detection processing section extracting features of the audio information on the basis of 
the scaling factors outputted from the scaling section." However, the examiner contends 
that this concept was well known in the art, as taught by Nakajima. 

In the same field of endeavor, Nakajima teaches a method for audio classification
from MPEG coded data, by processing the sub-band energy levels (§2, the scaling 
factors necessarily correspond to sub-band energy levels). 

Therefore, it would have been obvious to one having ordinary skill in the art at
the time the invention was made to modify Spec_Prior_Art by specifically providing the
features, as taught by Nakajima, because such feature extraction was well known in the
art at the time of invention for the purpose of identifying the content of the audio
signal being processed for marketing, monitoring commercials, improved speech
recognition (Kenyon et al. U.S. Patent 4,843,562, col. 1), and indexing, browsing, and
retrieval from multimedia databases (Nakajima, §1, ¶1).

Regarding claim 2, Spec_Prior_Art in view of Nakajima teaches everything 
claimed, as applied above (see claim 1). In addition, Nakajima teaches "the feature
detection processing section includes a means of determining whether or not the audio
information is of a voice signal interval on the basis of the scaling factors" (§1, ¶4,
classified into speech; §2.2, "Music/Speech Characteristics" based on the distribution
of energy; Figs. 1 and 2, n.b., the amplitude of each histogram corresponds to a
sub-band level).



Regarding claim 3, Spec_Prior_Art in view of Nakajima teaches everything 
claimed, as applied above (see claim 1). In addition, Nakajima teaches "wherein the
feature detection processing section includes a means of determining whether or not
the audio information is of a soundless signal interval on the basis of the scaling
factors" (§2.1, silence, if σ² is smaller than the predetermined threshold).

Regarding claim 4, this claim has corresponding limitations similar to the 
limitations in claim 1, and those limitations are rejected for the same reasons. In 
addition: "a signal level calculating section inputting thereto the scaling factor of each 
subband outputted from the scaling section, and calculating a signal level 
corresponding to the scaling factor; wherein the feature detection processing section 
extracts features of the audio information on the basis of the signal levels calculated by 
the signal level calculating section" (§2, "Classification Algorithm" Figs. 1 and 2; the 
amplitude of each histogram corresponds to a scaled sub-band level where this 
information is used during feature detection). 
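
For illustration only, the relation between a scaling factor and a signal level recited in claim 4 can be sketched as follows. The sketch assumes the scale factor table of ISO/IEC 11172-3 follows the commonly cited form 2^(1 - index/3) for indices 0 through 62. The function names and the decibel conversion are hypothetical and are taken neither from the application nor from Nakajima.

```python
import math


def scalefactor_from_index(index: int) -> float:
    """Return the multiplier for a scale factor index.

    Assumption: the table is modeled as 2 ** (1 - index / 3) for index 0..62.
    """
    if not 0 <= index <= 62:
        raise ValueError("scale factor index must be in 0..62")
    return 2.0 ** (1.0 - index / 3.0)


def signal_level_db(index: int) -> float:
    """Hypothetical signal level in dB relative to a multiplier of 1.0."""
    return 20.0 * math.log10(scalefactor_from_index(index))
```

Under this assumption, a smaller index corresponds to a larger multiplier and therefore to a higher signal level in the subband.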

Regarding claim 5, Spec_Prior_Art in view of Nakajima teaches everything 
claimed, as applied above (see claim 4). In addition, Nakajima teaches:

• the signal level calculating section inputs thereto the scaling factors in
low-frequency bands outputted from the scaling section within a predetermined period of
time to calculate the signal levels (§2, "Classification Algorithm"; §2.1, ¶s 1 and 2, low
frequency; over segment implies a predetermined interval); and

• the feature detection processing section comprises: a calculating means of
finding a maximum value and a minimum value of the signal levels calculated by the
signal level calculating section (§2, certain level of variation; requires the determination
of min/max, i.e., the range), and

• calculating a difference between the maximum value and the minimum value (§2,
variation); and

• a determining means of, when the difference value calculated by the calculating
means is greater than or equal to a predetermined threshold value, determining that
the audio information is of a voice signal interval, on the other hand, when the
difference value is less than the threshold value, determining that the audio information
is of a signal interval except for voice (§2, silence, if σ² is smaller than the
predetermined threshold, otherwise the audio information is evaluated for speech, etc.).
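
For illustration only, the determination recited in claim 5 can be sketched as follows. The sketch assumes the low-frequency signal levels for the predetermined period are already available as a list of numbers. The function name, the string labels, and the threshold value are hypothetical and appear in neither the application nor Nakajima.

```python
from typing import Sequence


def classify_voice_interval(levels: Sequence[float], threshold: float) -> str:
    """Label an interval from the spread of its low-frequency signal levels.

    A difference (max - min) at or above the threshold is labeled "voice";
    anything below it is labeled "non-voice".
    """
    if not levels:
        raise ValueError("at least one signal level is required")
    difference = max(levels) - min(levels)
    return "voice" if difference >= threshold else "non-voice"
```

With a threshold of 3.0, the levels 1.0, 5.0, and 2.0 give a difference of 4.0 and are labeled "voice", while the levels 2.0 and 2.5 give a difference of 0.5 and are labeled "non-voice".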



Regarding claim 6, Spec_Prior_Art in view of Nakajima teaches everything 
claimed, as applied above (see claim 4). In addition, Nakajima teaches:

• the signal level calculating section inputs thereto all of the scaling factors
outputted from the scaling section within a predetermined period of time to calculate
the signal levels (§1, from MPEG coded data; see Figs. 1 and 2); and

• the feature detection processing section includes a determining means of, when
the signal levels calculated by the signal level calculating section are greater than or
equal to a predetermined threshold value (§2, silence if σ² is smaller than the
predetermined threshold, otherwise the audio information is evaluated for speech, etc.),

• determining that the audio information is of a sound signal interval (§2.2
"Music/Speech Characteristics", not silence),

• on the other hand, when the signal levels are less than the threshold value,
determining that the audio information is of a soundless signal interval (§2, silence, if
σ² is smaller than the predetermined threshold).
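
For illustration only, the sound/soundless determination recited in claim 6 can be sketched as follows. The claim does not state how the signal levels of all subbands are combined, so the use of the mean level below is an assumption, as are the function name, the string labels, and the threshold value.

```python
from typing import Sequence


def classify_sound_interval(levels: Sequence[float], threshold: float) -> str:
    """Label an interval as "sound" or "soundless" from all subband levels.

    Assumption: the levels are combined by taking their arithmetic mean.
    """
    if not levels:
        raise ValueError("at least one signal level is required")
    mean_level = sum(levels) / len(levels)
    return "sound" if mean_level >= threshold else "soundless"
```

With a threshold of 1.0, the levels 0.5, 2.0, and 3.5 have a mean of 2.0 and are labeled "sound", while the levels 0.25 and 0.5 have a mean of 0.375 and are labeled "soundless".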

Regarding claim 7, this claim has limitations similar to claim 1 and is rejected for 
the same reasons. 

Regarding claim 8, this claim has limitations similar to claim 2 and is rejected for 
the same reasons. 

Regarding claim 9, this claim has limitations similar to claim 3 and is rejected for 
the same reasons. 

Regarding claim 13, this claim has limitations similar to claim 1 and is rejected 
for the same reasons. 

Regarding claim 14, this claim has limitations similar to claim 2 and is rejected 
for the same reasons. 

Regarding claim 16, this claim has limitations similar to claim 4 and is rejected 
for the same reasons. 

Regarding claim 17, this claim has limitations similar to claim 5 and is rejected
for the same reasons. 

Regarding claim 18, this claim has limitations similar to claim 6 and is rejected 
for the same reasons. 

Regarding claim 19, this claim has limitations similar to claim 7 and is rejected 
for the same reasons. 

Regarding claim 20, this claim has limitations similar to claim 8 and is rejected 
for the same reasons.

Regarding claim 21, this claim has limitations similar to claim 9 and is rejected 
for the same reasons. 

2. Claims 10-12, 15, and 22-24 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Spec_Prior_Art in view of Nakajima and well known prior art (MPEP 
2144.03). 

Regarding claim 10, Spec_Prior_Art describes the encoding portion of ISO/IEC 
11172-3, but does not specifically describe "a stream dividing section, after inputting
thereto bit stream data coded by a MPEG system, dividing the coded bit stream data
composed of each subband divided into each frequency band into bit assigning
information, scaling factor value indicating a multiplying power to a reference value, and
coded data in units of each subband; and a decoding processing section executing a
decoding process to the coded data divided by the stream dividing section in units of
each subband to output as audio information." However, the examiner takes official
notice of the fact that the use of a decoder for the purpose of decoding data encoded
according to ISO/IEC 11172-3 was well known in the art.

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Spec_Prior_Art such that a decoder is 
implemented, because a decoder is part of the ISO/IEC 11172-3 specification and
required for the complete processing of the signal. 

In addition, Spec_Prior_Art does not specifically teach: 

• a feature detection processing section extracting features of the audio 
information on the basis of the scaling factor values outputted from the stream dividing 
section; and 

• a signal level calculating section inputting thereto the scaling factor of each 
subband outputted from the stream dividing section to calculate a signal level; 

• wherein the feature detection processing section extracts features of the audio
information on the basis of the signal levels calculated by the signal level calculating 
section. 

However, the examiner contends that these concepts were well known in the art, 
as taught by Nakajima. 



Application/Control Number: 10/046,719 Page 9 

Art Unit: 2654 

In the same field of endeavor, Nakajima teaches a method for audio classification 
from MPEG coded data, where Nakajima processes the sub-band energy levels (§2, 
where the scaling factors necessarily correspond to sub-band energy levels since 
Nakajima is processing sub-band energy levels) and performs classification (§2, 
silence, speech, etc.). 

Therefore, it would have been obvious to one having ordinary skill in the art at
the time the invention was made to modify Spec_Prior_Art by specifically providing the
features, as taught by Nakajima, because such feature extraction was well known in the
art at the time of invention for the purpose of identifying the content of the audio
signal being processed for marketing, monitoring commercials, improved speech
recognition (Kenyon et al. U.S. Patent 4,843,562, col. 1), and indexing, browsing, and
retrieval from multimedia databases (Nakajima, §1, ¶1).

Regarding claim 11, Spec_Prior_Art in view of Nakajima and well known prior 
art teaches everything claimed, as applied above (see claim 10). In addition, Nakajima 
further teaches:

• the signal level calculating section inputs thereto the scaling factors in
low-frequency bands outputted from the stream dividing section within a predetermined
period of time to calculate the signal levels (§2, "Classification Algorithm" most of the
sub-band energy is confined to the lower sub-bands and variations are compared);
and

• the feature detection processing section comprises: a calculating means of
finding a maximum value and a minimum value of the signal levels calculated by the
signal level calculating section (§2, variation is determined), and

• calculating a difference between the maximum value and the minimum value (§2,
variation is determined with a necessary calculation of min/max difference); and

• a determining means of, when the difference value calculated by the calculating
means is greater than or equal to a predetermined threshold value, determining that
the audio information is of a voice signal interval, on the other hand, when the
difference value is less than the threshold value, determining that the audio information
is of a signal interval except for voice (§2.1, silence, if σ² is smaller than the
predetermined threshold).

Regarding claim 12, Spec_Prior_Art in view of Nakajima and well known prior 
art teaches everything claimed, as applied above (see claim 10). In addition, Nakajima 
further teaches: 

• the signal level calculating section inputs thereto all of the scaling factors
outputted from the stream dividing section within a predetermined period of time to
calculate the signal levels (§2, time and frequency analysis, frames in one second);
and

• the feature detection processing section includes a determining means of, when
the signal levels calculated by the signal level calculating section are greater than or
equal to a predetermined threshold value (§2.1, silence if σ² is smaller than the
predetermined threshold);




• determining that the audio information is of a sound signal interval (§2, §2.1,
"Silence Segment Detection" if not silence necessarily "sound"),

• on the other hand, when the signal levels are less than the threshold value,
determining that the audio information is of a soundless signal interval (§2.1, silence,
if σ² is smaller than the predetermined threshold).

Regarding claim 15, this claim has limitations similar to claim 3 and is rejected 
for the same reasons. 

Regarding claim 22, this claim has limitations similar to claim 10 and is rejected 
for the same reasons. 

Regarding claim 23, this claim has limitations similar to claim 11 and is rejected
for the same reasons. 

Regarding claim 24, this claim has limitations similar to claim 12 and is rejected 
for the same reasons. 




Citation of Pertinent Art 

3. The following prior art made of record but not relied upon is considered pertinent 
to the applicant's disclosure: 

• Zick et al. (U.S. Patent 6,370,504) disclose an invention that performs speech
recognition on MPEG/Audio encoded files.

• Patel et al. ("Audio Characterization for Video Indexing" SPIE, 1996, pp. 373-384)
teach audio feature extraction from an MPEG encoded audio stream.

Conclusion 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to V. Paul Harper whose telephone number is
(571) 272-7605. The examiner can normally be reached on M-F.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



V. Paul Harper 
Patent Examiner 
Art Unit 2654 